This report summarises the outcomes of a systematic literature search to identify Bayesian network models used to support decision making in healthcare. After describing the search methodology, we briefly review the selected research papers, with a view to identifying publicly available models and datasets that are well suited to analysis using the causal interventional analysis software tool developed in Wang B, Lyle C, Kwiatkowska M (2021). Finally, we carry out an experimental evaluation of the software on a selection of models and report preliminary results.
translated by Google Translate
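Bayesian structure learning allows one to capture uncertainty over the causal directed acyclic graph (DAG) responsible for generating given data. In this work, we present Tractable Uncertainty for STructure learning (TRUST), a framework for approximate posterior inference that relies on probabilistic circuits as the representation of our posterior belief. In contrast to sample-based posterior approximations, our representation can capture a much richer space of DAGs, while also being able to reason tractably about the uncertainty through a range of useful inference queries. We empirically show how probabilistic circuits can be used as an augmented representation for structure learning methods, leading to improvement in both the quality of inferred structures and posterior uncertainty. Experimental results on conditional query answering further demonstrate the practical utility of the representational capacity of TRUST.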
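Neuro-symbolic approaches to artificial intelligence, which combine neural networks with classical symbolic techniques, are growing in prominence, necessitating formal approaches to reason about their correctness. We propose a novel modelling formalism called neuro-symbolic concurrent stochastic games (NS-CSGs), which comprise probabilistic finite-state agents interacting in a shared continuous-state environment, observed through perception mechanisms implemented as neural networks. Since the environment state space is continuous, we focus on the class of NS-CSGs with Borel state spaces. We consider the problem of zero-sum discounted cumulative rewards and prove the existence of the value of NS-CSGs under Borel measurability and piecewise restrictions on the model components. From an algorithmic perspective, existing methods to compute values and optimal strategies for CSGs focus on finite state spaces. We present, for the first time, implementable value iteration and policy iteration algorithms to solve a class of uncountable state space CSGs, namely NS-CSGs, and prove their convergence. Our approach works by exploiting the underlying game structure and then formulating piecewise linear or constant representations of the value functions and strategies of NS-CSGs. We illustrate our approach by applying a prototype implementation of value iteration to a dynamic parking case study.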
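There is growing evidence that the classical notion of adversarial robustness, originally introduced for images, has been adopted by a large part of the NLP research community. We show that this notion is problematic in the context of NLP, as it considers only a narrow spectrum of linguistic phenomena. In this paper, we argue for semantic robustness, which is better aligned with the human concept of linguistic fidelity. We characterise semantic robustness in terms of the biases it is expected to induce in a model. We study the semantic robustness of a range of vanilla and robustly trained architectures using a template-based generative test bed. We complement the analysis with empirical evidence that, despite being harder to implement, semantic robustness can improve performance on complex linguistic phenomena where models that are robust in the classical sense fail.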
Deep neural networks have achieved impressive experimental results in image classification, but can surprisingly be unstable with respect to adversarial perturbations, that is, minimal changes to the input image that cause the network to misclassify it. With potential applications including perception modules and end-to-end controllers for self-driving cars, this raises concerns about their safety. We develop a novel automated verification framework for feed-forward multi-layer neural networks based on Satisfiability Modulo Theory (SMT). We focus on safety of image classification decisions with respect to image manipulations, such as scratches or changes to camera angle or lighting conditions that would result in the same class being assigned by a human, and define safety for an individual decision in terms of invariance of the classification within a small neighbourhood of the original image. We enable exhaustive search of the region by employing discretisation, and propagate the analysis layer by layer. Our method works directly with the network code and, in contrast to existing methods, can guarantee that adversarial examples, if they exist, are found for the given region and family of manipulations. If found, adversarial examples can be shown to human testers and/or used to fine-tune the network. We implement the techniques using Z3 and evaluate them on state-of-the-art networks, including regularised and deep learning networks. We also compare against existing techniques to search for adversarial examples and estimate network robustness.
Data scarcity is one of the main issues with the end-to-end approach for Speech Translation, as compared to the cascaded one. Although most data resources for Speech Translation are originally document-level, they offer a sentence-level view, which can be directly used during training. But this sentence-level view is single and static, potentially limiting the utility of the data. Our proposed data augmentation method SegAugment challenges this idea and aims to increase data availability by providing multiple alternative sentence-level views of a dataset. Our method heavily relies on an Audio Segmentation system to re-segment the speech of each document, after which we obtain the target text with alignment methods. The Audio Segmentation system can be parameterized with different length constraints, thus giving us access to multiple and diverse sentence-level views for each document. Experiments in MuST-C show consistent gains across 8 language pairs, with an average increase of 2.2 BLEU points, and up to 4.7 BLEU for lower-resource scenarios in mTEDx. Additionally, we find that SegAugment is also applicable to purely sentence-level data, as in CoVoST, and that it enables Speech Translation models to completely close the gap between the gold and automatic segmentation at inference time.
While the problem of hallucinations in neural machine translation has long been recognized, progress on alleviating it has so far been limited. Indeed, it recently turned out that, without artificially encouraging models to hallucinate, previously existing methods fall short and even the standard sequence log-probability is more informative. This means that characteristics internal to the model can give much more information than we expect, and before using external models and measures, we first need to ask: how far can we go if we use nothing but the translation model itself? We propose to use a method that evaluates the percentage of the source contribution to a generated translation. Intuitively, hallucinations are translations "detached" from the source, hence they can be identified by low source contribution. This method improves detection accuracy for the most severe hallucinations by a factor of 2 and is able to alleviate hallucinations at test time on par with the previous best approach that relies on external models. Next, if we move away from internal model characteristics and allow external tools, we show that using sentence similarity from cross-lingual embeddings further improves these results.
End-to-End speech-to-speech translation (S2ST) is generally evaluated with text-based metrics. This means that generated speech has to be automatically transcribed, making the evaluation dependent on the availability and quality of automatic speech recognition (ASR) systems. In this paper, we propose a text-free evaluation metric for end-to-end S2ST, named BLASER, to avoid the dependency on ASR systems. BLASER leverages a multilingual multimodal encoder to directly encode the speech segments for source input, translation output and reference into a shared embedding space and computes a score of the translation quality that can be used as a proxy to human evaluation. To evaluate our approach, we construct training and evaluation sets from more than 40k human annotations covering seven language directions. The best results of BLASER are achieved by training with supervision from human rating scores. We show that when evaluated at the sentence level, BLASER correlates significantly better with human judgment compared to ASR-dependent metrics including ASR-SENTBLEU in all translation directions and ASR-COMET in five of them. Our analysis shows combining speech and text as inputs to BLASER does not increase the correlation with human scores, but best correlations are achieved when using speech, which motivates the goal of our research. Moreover, we show that using ASR for references is detrimental for text-based metrics.
We consider the problem of decision-making under uncertainty in an environment with safety constraints. Many business and industrial applications rely on real-time optimization with changing inputs to improve key performance indicators. In the case of unknown environmental characteristics, real-time optimization becomes challenging, particularly for the satisfaction of safety constraints. We propose the ARTEO algorithm, where we cast multi-armed bandits as a mathematical programming problem subject to safety constraints and learn the environmental characteristics through changes in optimization inputs and through exploration. We quantify the uncertainty in unknown characteristics by using Gaussian processes and incorporate it into the utility function as a contribution which drives exploration. We adaptively control the size of this contribution using a heuristic in accordance with the requirements of the environment. We guarantee the safety of our algorithm with a high probability through confidence bounds constructed under the regularity assumptions of Gaussian processes. Compared to existing safe-learning approaches, our algorithm does not require an exclusive exploration phase and follows the optimization goals even in the explored points, which makes it suitable for safety-critical systems. We demonstrate the safety and efficiency of our approach with two experiments: an industrial process and an online bid optimization benchmark problem.
In this paper, negatively inclined buoyant jets, which appear during the discharge of wastewater from processes such as desalination, are investigated. To minimize harmful effects and assess environmental impact, a detailed numerical investigation is necessary. The selection of appropriate geometry and working conditions for minimizing such effects often requires numerous experiments and numerical simulations. For this reason, the application of machine learning models is proposed. Several models, including Support Vector Regression, Artificial Neural Networks, Random Forests, XGBoost, CatBoost and LightGBM, were trained. The dataset was built from numerous OpenFOAM simulations, which were validated against experimental data from previous research. The best prediction was obtained by the Artificial Neural Network, with an average R² of 0.98 and RMSE of 0.28. To understand the workings of the machine learning model and the influence of all parameters on the geometrical characteristics of inclined buoyant jets, the SHAP feature interpretation method was used.